Investigations on inter-speaker variability in the feature space
Abstract
We apply Fisher variate analysis to measure the effectiveness of speaker normalization techniques. A trace criterion, which measures the ratio of the variation due to different phonemes to the variation due to different speakers, serves as a first assessment of a feature set without the need for recognition experiments. Using this measure together with recognition experiments, we demonstrate that cepstral mean normalization has a speaker normalization effect in addition to its well-known channel normalization effect. Similarly, vocal tract normalization (VTN) is shown to remove inter-speaker variability. For VTN, we show that normalization on a per-sentence basis performs better than normalization on a per-speaker basis. Recognition results are given on the Wall Street Journal and Hub-4 databases.
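The abstract does not give the exact form of the trace criterion, so the following is only a minimal sketch under stated assumptions: it takes the criterion to be the trace of the inverse speaker scatter times the phoneme scatter, with both scatters computed from group means. The function names, the small ridge term, and the synthetic data are all hypothetical illustrations, and the per-speaker mean subtraction is a simple stand-in for a normalization step, not the method used in the paper.

```python
import numpy as np


def scatter_of_group_means(features, labels):
    """Weighted scatter matrix of the group means around the global mean."""
    global_mean = features.mean(axis=0)
    dim = features.shape[1]
    scatter = np.zeros((dim, dim))
    for label in np.unique(labels):
        group = features[labels == label]
        diff = (group.mean(axis=0) - global_mean)[:, None]
        scatter += len(group) * (diff @ diff.T)
    return scatter / len(features)


def trace_criterion(features, phonemes, speakers, ridge=1e-6):
    """Assumed criterion: trace(S_speaker^-1 S_phoneme); larger is better."""
    s_phoneme = scatter_of_group_means(features, phonemes)
    s_speaker = scatter_of_group_means(features, speakers)
    # The ridge (an assumption) keeps the speaker scatter invertible.
    s_speaker = s_speaker + ridge * np.eye(features.shape[1])
    return float(np.trace(np.linalg.solve(s_speaker, s_phoneme)))


def subtract_speaker_means(features, speakers):
    """Stand-in normalization: remove each speaker's mean feature vector."""
    normalized = features.copy()
    for speaker in np.unique(speakers):
        mask = speakers == speaker
        normalized[mask] -= features[mask].mean(axis=0)
    return normalized


def make_synthetic(n_phonemes=5, n_speakers=4, frames=50, dim=3, seed=0):
    """Synthetic frames: phoneme mean + speaker offset + noise."""
    rng = np.random.default_rng(seed)
    phoneme_means = rng.normal(0.0, 2.0, size=(n_phonemes, dim))
    speaker_offsets = rng.normal(0.0, 1.0, size=(n_speakers, dim))
    feats, phon, spk = [], [], []
    for p in range(n_phonemes):
        for s in range(n_speakers):
            noise = rng.normal(0.0, 0.5, size=(frames, dim))
            feats.append(phoneme_means[p] + speaker_offsets[s] + noise)
            phon.extend([p] * frames)
            spk.extend([s] * frames)
    return np.vstack(feats), np.array(phon), np.array(spk)


if __name__ == "__main__":
    x, phonemes, speakers = make_synthetic()
    before = trace_criterion(x, phonemes, speakers)
    after = trace_criterion(subtract_speaker_means(x, speakers), phonemes, speakers)
    print("criterion before normalization:", before)
    print("criterion after normalization: ", after)
```

On this synthetic data the criterion rises after the speaker means are removed, which mirrors the way the abstract uses the measure to compare feature sets without running a recognizer.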
Similar resources
Eliminating Inter-speaker Variability Prior to Discriminant Transforms
This paper shows the impact of speaker normalization techniques such as vocal tract length normalization (VTLN) and speaker-adaptive training (SAT) prior to discriminant feature space transforms, such as LDA. We demonstrate that removing the inter-speaker variability by using speaker compensation methods results in improved discrimination as measured by the LDA eigenvalues and also in improved ...
Hyperspectral Image Classification Based on the Fusion of the Features Generated by Sparse Representation Methods, Linear and Non-linear Transformations
The ability to record the high-resolution spectral signature of the earth's surface is the most important feature of hyperspectral sensors. Classification of hyperspectral imagery is one of the methods for extracting information from these remote sensing data sources. Despite the high potential of hyperspectral images in terms of information content, there...
Variability of Acoustic Features of Hypernasality and it’s Assessment
Hypernasality (HP) is observed across voiced phonemes uttered by Cleft-Palate (CP) speakers with defective velopharyngeal (VP) opening. HP assessment using signal processing techniques is challenging due to the variability of acoustic features across conditions such as speaker, speaking style, speaking rate, and severity of HP. Most of the study for hypernasality (HP) assessment is base...
Modeling intra-speaker variability for speaker recognition
In this paper we present a speaker recognition algorithm that explicitly models intra-speaker inter-session variability. Such variability may be caused by changing speaker characteristics (mood, fatigue, etc.), channel variability, or noise variability. We define a session-space in which each session (either train or test session) is a vector. We then calculate a rotation of the session-space fo...
Trainable speaker diarization
This paper presents a novel framework for speaker diarization. We explicitly model intra-speaker inter-segment variability using a speaker-labeled training corpus and use this modeling to assess the speaker similarity between speech segments. Modeling is done by embedding segments into a segment-space using kernel-PCA, followed by explicit modeling of speaker variability in the segment-space. O...
Publication date: 1999